(19) Europäisches Patentamt
European Patent Office
Office européen des brevets

(11) EP 1 096 777 A1

(12) EUROPEAN PATENT APPLICATION



(43) Date of publication: 

02.05.2001 Bulletin 2001/18 

(21) Application number: 99308537.2 

(22) Date of filing: 28.10.1999 



(51) Int. Cl.7: H04N 1/04



(84) Designated Contracting States:
AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
Designated Extension States:
AL LT LV MK RO SI

(71) Applicant: Hewlett-Packard Company,
A Delaware Corporation
Palo Alto, CA 94304 (US)

(72) Inventors:
• Cheatle, Stephen Philip
Bristol BS9 2AU (GB)
• Grosvenor, David Arthur
Frampton Cotterell, Bristol BS36 2AD (GB)

(74) Representative: Lawrence, Richard Anthony et al
Hewlett-Packard Limited,
IP Section,
Building 3,
Filton Road
Stoke Gifford, Bristol BS34 8QZ (GB)





(54) Document imaging system 

(57) The present invention relates to the use of an image capture
system (1) having an electronic camera (2) for platenless document
imaging. The system (1) comprises: an electronic camera (2) with an
electronic detector (4) and a lens (6) with a field of view (8) for
imaging on the detector (4) a portion (12,14,46) of a document (10);
a support (16) by which the camera (2) can be positioned to view the
document (10); an actuator (25) for moving (44) the camera field of
view (8) so that a plurality of overlapping image tiles of a document
(10) can be captured at predetermined different locations (12,14,46)
over the support surface (20), each image tile having an array of tile
data points and being subject to some expected perspective and/or
camera distortion relative to the support surface; and electronic
processing means (32) by which the plurality of image tiles may be
joined into a composite image of the document (10). The electronic
processing means (32) includes a memory (41) which stores transform
data for each image tile, the transform data (55,56) relating both to
the expected distortion and to the predetermined overlap (15) between
image tiles. The processing means (32) is adapted to use the transform
data to generate from the tile data points a corrected array of tile
data points with the distortion corrected and with the corrected image
tiles correctly overlapped to form a composite image of the document
(10).

Description

[0001] The present invention relates to the use of an 
electronic camera in a platenless document imaging 
system in which the document image is a composite im- 
age formed from a mosaic of overlapping images cap- 
tured by the camera. 

[0002] In recent years, document scanners have be- 
come commonplace. Although these work well and are 
relatively inexpensive, a flatbed or platen-based docu- 
ment scanner occupies a significant amount of scarce 
desk space. 

[0003] The use of a camera to take a photograph of 
a document consisting of text and/or images offers one 
way of dealing with the problem of wasted desk space. 
An electronic camera would need to have a detector with 
about 40 megapixel resolution in order to have a reso- 
lution over an A4 sized document comparable with that 
of the resolution of a typical document scanner, typically 
about 24 dots/mm (600 dpi). Such high-resolution de- 
tectors cost much more than the total cost of a desktop 
scanner. 
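
As a rough check on the figure quoted in paragraph [0003], the pixel count can be worked out directly. The sketch below assumes the standard A4 dimensions of 210 mm by 297 mm, which the text does not state explicitly; the function name `pixels_needed` is illustrative only.

```python
# Rough arithmetic check: how many detector pixels would a single-shot
# camera need to match a 24 dots/mm scanner over an A4 page?
# Assumption: A4 is 210 mm x 297 mm (not stated in the text).

def pixels_needed(width_mm: float, height_mm: float, dots_per_mm: float) -> int:
    """Total pixel count for the given page size and sampling density."""
    return round(width_mm * dots_per_mm) * round(height_mm * dots_per_mm)


total = pixels_needed(210, 297, 24)
print(f"{total / 1e6:.1f} megapixels")  # prints "35.9 megapixels"
```

The result, roughly 36 megapixels, is consistent with the "about 40 megapixel" figure once some margin around the page is allowed for.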

[0004] As a result, it has been proposed to use an 
electronic camera with an actuator to scan the field of 
view of the camera over a document, and so form a com- 
posite image of the document from a number of over- 
lapping image tiles. This permits less expensive lower 
resolution detector arrays to be used to build up an im- 
age of a document with a resolution comparable with 
that of a conventional document scanner. See, for example,
patent document US 5,515,181.
[0005] A problem with this approach is the fact that 
the image tiles must have some overlap, because it is 
impractical to use an actuator which moves the camera 
so precisely that tiles will fit together with no overlap. 
The conventional approach to fitting together overlap- 
ping tiles involves identifying features in the image of 
one tile in an overlap region and matching this against 
a corresponding feature in an adjacent tile's overlap re- 
gion. 

[0006] This feature matching approach suffers from 
various difficulties. First, computational algorithms to 
identify and match features are relatively slow com- 
pared with the process of gathering the images, which 
limits the throughput of a scanning camera document 
imaging system. Second, many documents have signif- 
icant areas of blank space, for which it is not possible to 
match features. This necessitates the use of larger over- 
lap areas to increase the likelihood that there will be suit- 
able matching features in the overlap areas, with the re- 
sult that more images must be captured. Third, it is pos- 
sible that features will be incorrectly matched, particu- 
larly for text based documents in which common letters 
repeat frequently. 

[0007] Another problem is that an image from an inexpensive camera
will have some image distortion, particularly towards the edges of
the field of view. The distortion is therefore strongest in the
overlap region between tiles, which makes it more difficult to
achieve a good overlap simply by matching features. As a result, it
may be necessary to match several features over the extent of the
overlap area to get a good fit between adjacent tiles.

[0008] As a result of problems such as these, scan- 
ning camera-based document imaging systems cannot 
yet compete with flatbed or platen-based document 
scanning systems. 

[0009] It is an object of the present invention to provide an image
capture system using a scanning electronic camera that addresses
these problems.
[0010] Accordingly, the invention provides an image capture system,
comprising: an electronic camera with an electronic detector and a
lens with a field of view for imaging on the detector a portion of a
document; an actuator for moving the camera field of view over a
document support surface, the camera and the actuator co-operating so
that a plurality of overlapping image tiles of a document can be
captured at different locations over the support surface, each image
tile having an array of tile data points and being subject to some
expected perspective and/or camera distortion relative to the support
surface; and electronic processing means by which the plurality of
image tiles may be joined into a composite image of the document;
characterised in that:

i) for each image tile the camera field of view relative to the
support surface and the degree of overlap between neighbouring image
tiles are predetermined;

ii) the electronic processing means includes a memory which stores
transform data for each image tile, the transform data relating both
to the expected distortion and to the predetermined overlap between
image tiles; and

iii) the processing means is adapted to use the transform data to
generate from the tile data points a corrected array of tile data
points with said distortion corrected and with the corrected image
tiles correctly overlapped with respect to neighbouring corrected
image tiles to form a composite image of the document.

[0011] The image capture system may include a support by which the
camera can be positioned to view a document support surface on which
the document may be placed in view of the camera.

[0012] Because the camera field of view relative to the document
support surface and the overlap are predetermined, the relative
orientation and distortion of each image tile with respect to its
neighbours will be repeatably the same, to within some residual
positioning error for the camera actuator. Therefore, if the system
is used to image a document more than once, without moving the
document with respect to the document support surface, each of the
image tiles will be substantially the same as the corresponding image
tile from one time to the next.

[0013] Therefore, as long as the positioning and 
movement of the camera is repeatable, transform data 
needs only to be generated and stored once. The trans- 
form data then relates each image data point of each 
image tile to a corrected image data point. The corrected 
image data point of one tile at a point in an overlap area 
will then correspond closely to a corresponding corrected
image data point in a similarly overlapping area of an
adjacent or neighbouring tile.

[0014] The invention also provides a method of capturing an image of
a document using an image capture system according to the invention,
in which the method comprises the steps of:

a) positioning the camera above the support sur- 
face and placing a document on the support surface 
within view of the camera; 

b) capturing a plurality of overlapping image tiles of 
the document at different locations over the support 
surface, each image tile having an array of tile data 
points; 

characterised in that in step b) the different locations are 
predetermined so that for each image tile the camera 
field of view relative to the support surface and the de- 
gree of overlap between neighbouring image tiles are 
predetermined, and in that the method comprises the 
steps of: 

c) using the electronic processing means to gener- 
ate from the transform data and the tile data points 
a corrected array of tile data points in order to cor- 
rect said distortion and correctly overlap each cor- 
rected image tile with respect to neighbouring cor- 
rected image tiles; and 

d) joining neighbouring corrected image tiles to form 
a composite image of the document. 

[0015] The system may include a mount by which the 
camera may be mounted over the document support 
surface, which may be a desk or other such work sur- 
face. The mount may position the camera either directly 
above, or above and to one side of the work surface. If 
the camera is mounted to one side of the work surface, 
then the actuator is most conveniently a two-axis tilt and 
pan actuator. 

[0016] Most commonly, the document support sur- 
face will be a work surface, such as a desktop. 
[0017] The accuracy of many types of actuator is limited by
mechanical play or backlash in the actuator driving mechanism. Such
imperfections can be minimised if the actuator always follows the
same pattern of movement as the camera field of view is moved from a
start position over the document support surface, and then back to
the original start position. In this way, the relative orientation
between image tiles and the degree of overlap between neighbouring
tiles can be made most accurate. The absolute orientation of the set
of image tiles with respect to the document support surface is then a
secondary consideration, as long as perspective distortion does not
change significantly from one pass of the camera over a document to
the next pass.
[0018] In a preferred embodiment of the invention, pri- 
or to storing of the transform data in the memory, the 
method comprises the steps of: 

e) providing a two-dimensional registration array 
within the field of view of the camera across an area 
corresponding with the document to be imaged, the 
registration array having a plurality of individually 
identifiable location identification features with a 
predetermined orientation and spacing amongst 
the features; 

f) using the camera to capture a plurality of overlap- 
ping image tiles of the registration array at prede- 
termined locations and predetermined overlap, said 
locations and overlap corresponding to those to be 
used with the document to be imaged, each image 
tile having an array of tile data points that cover a 
plurality of location features; 

g) identifying for each image tile a plurality of indi- 
vidual location identification features, associating 
with said features particular tile data points and from 
the predetermined orientation and spacing of the 
specific features determining from the tile data 
points if there is any image distortion in that image 
tile; 

h) generating from the identity of the location iden- 
tification features and the determined distortion the 
transform data for each image tile. 

[0019] The transform data can therefore be derived 
empirically in an initial calibration of the image capture 
system. Because the calibration data is generated di- 
rectly from the same equipment that will be used to im- 
age the document, the calibration will be naturally close 
to the actual performance of the image capture system. 
It is therefore unnecessary to generate the transform da- 
ta using a mathematical model of the camera and cam- 
era scanning system. In use of the image capture sys- 
tem, the use of the transform data to correct an image 
tile to achieve a correct overlap between neighbouring 
tiles involves relatively little computational effort com- 
pared with aligning image tiles solely by matching iden- 
tifiable features in the imaged document. 
[0020] It is particularly advantageous if, in the generation of the
transform data, when the camera is used to capture a plurality of
overlapping image tiles of the registration array, the actuator moves
the camera between tiles in the same order as for the document to be
scanned. Therefore, any repeatable imperfections in the movement of
the camera between image tiles will automatically be accounted for in
the transform data.
[0021] Preferably, there are at least four location iden- 
tification features that are identified for each image tile. 
[0022] Preferably each image tile captured of the reg- 
istration array has at least one unique location identifi- 
cation feature. Because the spacing and orientation be- 
tween location identification features is known, this then 
allows the separation and orientation between any two 
image tiles of the registration array to be determined 
solely from the image tile data points of the registration 
array. 

[0023] One way of providing a registration array is if 
the location identification features are printed on a card. 
The card may be used just in a manufacturing environ- 
ment. However, if the mount has a location feature for 
correctly orienting the card and document with respect 
to the camera field of view, then the card may be used 
either by a user of the image capture system, or by a 
service engineer should the system need to be recali- 
brated. The location feature may be a right angle bracket 
for aligning a right angle corner of a document to be 
scanned. 

[0024] The electronic camera will have some depth of focus defined by
the lens, and aperture setting if any. The portion of the document
being imaged will need to lie within this depth of focus in order to
achieve optimum resolution of the document. If the document is thin,
then it will effectively lie in the plane of the document support
surface. However, if the document is thick, then the calibration of
the transform data may not be valid, for example because of different
perspective distortion. One way to overcome this is if the actuator
is arranged to rotate the camera about the optical centre of the lens
as the camera field of view is moved over the support surface. Then,
it is only necessary to have one set of transform data, as this will
apply to different focus displacements away from that for the
document support surface.
[0025] However, such a lens introduces additional 
cost. Therefore, the actuator may not be arranged to ro- 
tate the camera about the optical centre of the lens as 
the camera field of view is moved over the support sur- 
face. In this case, the camera includes a focus mecha- 
nism, and the transform data includes separate data for 
different focus settings. The method of imaging a docu- 
ment then comprises the additional steps of: focussing 
the camera on the document; and selecting the transform data
according to the focus setting. The focus setting may be determined
from the lens position, an optical focus sensor or by other means,
for example an ultrasonic focus detector.

[0026] The invention will now be described by way of 
example, with reference to the accompanying drawings, 
in which: 

Figure 1 is a schematic drawing of an image capture system according
to the invention, showing how an electronic camera images a document
with a predetermined pattern of overlapping image tiles;

Figure 2 is a diagram showing how transform data can be used to
correct distortion in the image tiles, and correctly orient and
overlap image tile data in neighbouring tiles to form a composite
image of the document;

Figure 3 is a preferred embodiment of a registration array having
numerous unique location identification features used in the
generation of the transform data;

Figure 4 is an enlarged diagram of part of Figure 2, showing
corrected image tile data in the region of overlapping image tiles;
and

Figure 5 is a flow chart of a preferred method according to the
invention for using the image capture system to image a document.

[0027] With reference to Figure 1, an image capture system 1 for
imaging a document 10, for example a document of A4 size, includes a
conventional electronic camera 2 with a CCD detector array 4 having a
relatively moderate resolution of 480 by 640 pixels. A lens 6 with an
autofocus mechanism 7 has a field of view 8 directed at a portion 12
of the document 10, shown in outline and cross-hatching. A total of
thirty-six such portions cover the document 10. Each portion overlaps
slightly with its neighbouring portions, as shown by the different
angled cross-hatching shown for one neighbouring portion 14. The
tiles 12 and 14 therefore overlap in an overlap area 15.

[0028] The camera 2 is mounted atop a stand 16 that is affixed to an
edge 18 of a work surface 20. The stand 16 has at its base a
right-angled bracket 22 that is used both to fix the stand 16 to the
work surface 20 and to align correctly the document 10 with respect
to the camera 2. A cylindrical post 24 extends vertically from the
right-angled corner of the bracket 22. Between the top of the post 24
and the camera 2 is a motorised actuator mechanism 25. The mechanism
25 comprises atop the post 24 a cylindrical joint 26 that is coaxial
with the post 24, and above this an arcuate tilting arm 28 which is
connected to the base of the camera 2. The actuator mechanism is
connected by a ribbon cable 30 to a controller unit 32, which may,
for example, be an expansion card in a personal computer. The
cylindrical joint 26 can rotate 34 the camera about a vertical axis,
and the arcuate arm 28 can rotate 36 the camera about a horizontal
axis.
[0029] Alternatively, if the camera is mounted directly above the
document, for example being roughly centered above the document, then
the actuator may have two horizontal axes of rotation.
[0030] The controller unit 32 is connected by a second ribbon cable
37 to the electronic camera 2 in order to control the operation of
the camera, and in particular, as shown in Figure 2, to download from
the camera thirty-six overlapping image tiles 51,52,53 taken of the
document 10. The controller unit 32 will have a microprocessor (µP)
40 and a memory (RAM) 41, and may, of course, be connected to a
conventional removable storage device and display (not shown).
[0031] Optionally, some of the functions of the con- 
troller unit 32, such as the microprocessor 40 and mem- 
ory 41 may be incorporated in the electronic camera 2. 
[0032] As shown in Figure 1, the camera lens 6 is in a start position
in which the lens 6 looks downwards nearly vertically onto the
portion 12 of the document 10 closest to the corner of the angle
bracket 22. The camera 2 is stationary when the controller unit 32
captures an image of each document portion 12,14. Between captured
images the actuator 25 is controlled by the controller unit 32 to
move between document portions according to a predetermined pattern,
illustrated by head-to-tail arrows 44, until a last document portion
46 is reached, whereupon the camera 2 is moved back to the start
position, as shown by dashed line arrow 48. During this movement, the
overlap areas 15 between each document portion 12,14,46 are
predetermined, as therefore are the corresponding overlap areas 50
between image tiles 51,52,53.
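
The text does not specify the predetermined pattern 44 beyond the head-to-tail arrows of Figure 1. Purely as an illustration, and assuming a six-by-six grid of tile positions visited in a serpentine order so that each move goes to an adjacent tile, a fixed visiting order could be generated as below. The function name `serpentine_order` and the (row, column) indexing are hypothetical and not taken from the described system.

```python
# Illustrative sketch only: one possible fixed tile-visiting order.
# Assumption: tiles form a rows x cols grid and are visited row by row,
# reversing direction on alternate rows so every move is to a neighbour.

def serpentine_order(rows: int, cols: int) -> list[tuple[int, int]]:
    """Return (row, col) tile indices in a back-and-forth scan order."""
    order = []
    for r in range(rows):
        # Even rows run left to right, odd rows right to left.
        col_range = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in col_range)
    return order


path = serpentine_order(6, 6)
print(len(path), path[0], path[-1])
```

Because the same list is replayed on every pass, any repeatable positioning error at a given step would recur at the same tile, which is the property the calibration relies on.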

[0033] Because none of the document portions 
12,14,46 is presented directly face-on to the lens 38, the
captured image of each image tile 51,52,53 will have 
some perspective distortion, also called "keystone" dis- 
tortion. Therefore, both the document portions 12,14,46 
and the overlap areas 15 will not in general be rectan- 
gular (as drawn for clarity), but trapezoidal. In addition, 
unless an expensive lens is used, the image on the de- 
tector 4 will in general have some lens distortion, most 
notably radial distortion. 

[0034] In principle, the expected distortions of the image tiles can
be calculated from a theoretical model of the imaging system and by
measuring carefully the actual position of the lens 6. In practice,
it is difficult to accurately model the performance of the lens 6
with respect to the document 10. Very small errors in the model
can generate significant discontinuities in the composite 
image if the correction transforms are derived from the 
model. 

[0035] Figures 2, 3 and 4 show how empirically derived transform data
can be used to perform the necessary corrections, provided that the
overlap areas 15 and orientations between imaged document portions
12,14,46 are repeatable each time the overlapping document images
51,52,53 are captured. In Figure 2, each of the thirty-six image
tiles 51,52,53 captured by the detector array 4 has pixels whose
(x,y) co-ordinates are represented by $(x^1, y^1)$ to
$(x^{36}, y^{36})$. Each of the image tiles 51,52,53 can make a
correct contribution to a composite image 54 of the document 10 after
a transform operator (T) has transformed 55 the (x,y) co-ordinates to
the corrected co-ordinates $(x', y')$ for corresponding corrected
image tiles 151,152,153. Each of the corrected image tiles
151,152,153 will then have predetermined overlap areas 150 that
correspond with overlap areas 50 of the original image tiles
51,52,53.

[0036] The transform data is empirically derived with 
the use of a registration array 60 shown in Figure 3, 
which may be printed on a portable substrate such as 
paper or card. 

[0037] The registration array 60 in use is slightly larger than A4
size and comprises a square array of circular features 61, which have
been devised so that pattern recognition software can quickly
determine both the centre of each circular feature 61 and the
identity of each feature. Because the layout of the pattern is known,
the orientation and spacing between the centres of any two of the
circular features 61 can be calculated. On a six-by-six array, over
the A4 registration array 60, each image tile 51,52,53 will have at
least about 70 such circular features 61. If it is desired to image
documents larger or smaller than A4 size, then of course the
registration array can be made correspondingly larger or smaller.
[0038] So that these features 61 are individually iden- 
tifiable, the array has at regular intervals on a square 
grid a square grouping 62 of four variable and identifia- 
ble circular patterns, each of which consists of one of 
eight different possible patterns of alternating white and 
black concentric rings or circles. Image processing soft- 
ware can readily measure the extent and number of the 
circular white and dark bands, and so unambiguously 
determine which of the eight possible circular features 
has been identified, as well as identifying the centre of 
each of the variable features 62. As a check on the ve- 
racity of the identified pattern, in none of the groupings 
62 of four circular features does the same type of ring 
feature appear more than once. Any particular pattern 
can only validly occur in one of the four possible rotational
orientations. There are therefore (8 × 7 × 6 × 5)/4 = 420
different possible combinations, only 155 of which are 
used in the illustrated registration array 60. 
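
The count of 420 can be checked by brute force. The sketch below assumes that each grouping 62 can be read as a cyclic sequence of four distinct ring patterns around the square, so that the four rotations of a grouping are the four cyclic shifts of that sequence; the pattern labels 0 to 7 and the function name `count_groupings` are illustrative.

```python
# Brute-force check of (8 x 7 x 6 x 5) / 4 = 420.
# Assumption: a 2x2 grouping is modelled as a cyclic sequence of four
# distinct pattern labels, and rotating the grouping by 90 degrees
# corresponds to a cyclic shift of that sequence.
from itertools import permutations


def count_groupings(n_patterns: int = 8, group_size: int = 4) -> int:
    """Count groupings of distinct patterns, treating rotations as equal."""
    canonical = set()
    for p in permutations(range(n_patterns), group_size):
        # Use the lexicographically smallest rotation as the representative.
        rotations = [p[i:] + p[:i] for i in range(group_size)]
        canonical.add(min(rotations))
    return len(canonical)


print(count_groupings())  # prints 420
```

Since no pattern repeats within a grouping, all four rotations of any arrangement are distinct, which is why the division by four is exact.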
[0039] The groupings of four identifiable circular fea- 
tures 62 are separated by pairs of rows and columns of 
uniform circular features 63. The spacing of the identi- 
fiable groupings 62 is such that there is at least one, and 
in the present example at least four, such groupings in 
each captured image tile of the registration array 60. 
[0040] Once the identifiable groupings 62 have been identified in
such a captured image tile 51,52,53, the other uniform features 63
can be identified from the known arrangement of these uniform
features 63. The result is that each captured image tile 51,52,53 of
the registration array 60 will have approximately 70 to 80
identification features, indicated for simplicity below simply as the
number 70.

[0041] Figures 2 and 4 show how the apparent locations of the
registration array features, designated $(x^n, y^n)$ for the nth
image tile, are used to generate transform data that allows the
captured image tiles 51,52,53 to be transformed (T) 55 to a composite
image 54 of the document 10. First, the apparent locations of the
registration array elements $(x^n, y^n)_{1..70}$ on the detector
array are deduced from the circular shape of each location
identification feature 61. These locations $(x^n, y^n)_{1..70}$ will
not in general coincide with the centres of the pixel elements 58 of
the detector array 4. The problem to be solved is how to transform
these apparent locations into 'true' locations
$(x'^n, y'^n)_{1..70}$ for the composite image 54, using the known
position and orientation of the registration array features 61.

[0042] Although the number of location identification 
features 61 identified for each captured image tile 
51,52,53 is preferably at least four, a more accurate 
transform T can be generated if the number of location 
identification features 61 is about 60 to 80 for each image
tile 51,52,53. The reason for this is as follows.
[0043] Given a set of k point correspondences of the form
$(x^n, y^n)_{1..k}$ to $(x'^n, y'^n)_{1..k}$, where k > 4, we require
a perspective and distortion transform model which transforms each
point $(x, y)$ into its corresponding point $(x', y')$.

[0044] Using homogeneous co-ordinates, as is standard practice in
computer graphics and image warping, we can represent this transform
by a three-by-three matrix:

$$
\begin{pmatrix} u \\ v \\ w \end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & 1
\end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
$$

where $x' = u/w$ and $y' = v/w$.

[0045] The matrix has eight unknowns $a_{11} \ldots a_{32}$.
Multiplying out this equation we have:

$$
x' = \frac{a_{11} x + a_{12} y + a_{13}}{a_{31} x + a_{32} y + 1}
$$

and

$$
y' = \frac{a_{21} x + a_{22} y + a_{23}}{a_{31} x + a_{32} y + 1}
$$

[0046] Each point correspondence $(x, y)$ to $(x', y')$ thus provides
two linear equations in the unknowns $a_{11}$ to $a_{32}$. A set of k
correspondences where k > 4 produces an over-constrained linear
system of equations. A least squares fitting method such as Singular
Value Decomposition can be used to determine a solution for $a_{11}$
to $a_{32}$ which best fits the observed correspondences.
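
The least-squares fit described in paragraphs [0043] to [0046] can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the described system: the function names `fit_transform` and `apply_transform` are hypothetical, the eight unknowns are ordered a11, a12, a13, a21, a22, a23, a31, a32, and `numpy.linalg.lstsq` stands in for whichever least-squares solver is actually used.

```python
# Minimal sketch: fit the eight-parameter perspective transform from
# k point correspondences by linear least squares.
import numpy as np


def fit_transform(src, dst):
    """Fit the 3x3 matrix T (bottom-right entry fixed at 1) mapping src to dst.

    src, dst: arrays of shape (k, 2) holding (x, y) and corrected (x', y').
    Each correspondence contributes two linear equations:
      a11*x + a12*y + a13 - x'*(a31*x + a32*y) = x'
      a21*x + a22*y + a23 - y'*(a31*x + a32*y) = y'
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    x, y = src[:, 0], src[:, 1]
    xp, yp = dst[:, 0], dst[:, 1]
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    rows_x = np.stack([x, y, ones, zeros, zeros, zeros, -xp * x, -xp * y], axis=1)
    rows_y = np.stack([zeros, zeros, zeros, x, y, ones, -yp * x, -yp * y], axis=1)
    A = np.concatenate([rows_x, rows_y])
    b = np.concatenate([xp, yp])
    a, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solution
    return np.append(a, 1.0).reshape(3, 3)


def apply_transform(T, pts):
    """Map (x, y) points through T using homogeneous co-ordinates."""
    pts = np.asarray(pts, dtype=float)
    uvw = np.column_stack([pts, np.ones(len(pts))]) @ T.T
    return uvw[:, :2] / uvw[:, 2:3]


# Usage with synthetic correspondences generated from a known transform.
T_true = np.array([[1.1, 0.02, 5.0], [-0.03, 0.95, -2.0], [1e-4, -2e-4, 1.0]])
src_pts = np.array([[x, y] for x in range(0, 640, 80) for y in range(0, 480, 80)], dtype=float)
dst_pts = apply_transform(T_true, src_pts)
T_fit = fit_transform(src_pts, dst_pts)
```

With noise-free synthetic data the fitted matrix reproduces the corrected points; with real feature centres, using many more than four correspondences averages out localisation error, which is the motivation given in paragraph [0042].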
[0047] Once the transform T has been determined, 
the captured image tiles 51,52,53 can be warped according
to any of a number of known image warping
techniques. See for example, "Digital Image Warping", 
G. Wolberg, IEEE Computer Society Press 1990. 
[0048] One approach is to express the transform T as a matrix. The
captured image 51 can then be warped into the corrected image 54 by
first inverting matrix T to generate a reverse transform (R) 56 which
maps points in the corrected image 54 to the original image 51. With
reference to Figure 4, the co-ordinates $(x'_i, y'_j)$ of each pixel
158 to be assigned a value in the corrected image 54 can be
transformed 56 by inverse matrix R to give the corresponding location
in the original image 51. This location will not in general
correspond exactly with a pixel 58 in a captured image tile 51,52,53.
The image intensity at this location can be determined by one of a
number of interpolation methods such as bi-linear interpolation or
bi-cubic interpolation. The process is repeated for all pixel
locations 158 in the corrected image 54 which, when transformed by R,
are defined in the original image tiles 51,52,53. In areas of the
corrected image 54 which are covered by the overlap of two or more
corrected image tiles 151,152,153, the pixels 158 can be either
selected from just one of the original image tiles 51,52,53, or be a
blend or average of more than one of these image tiles.
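
The reverse-mapping step of paragraph [0048] can be sketched as follows, using bi-linear interpolation. This is an illustrative sketch only: the function name `warp_tile` is hypothetical, the tile is assumed to be a single-channel array of at least 2 by 2 pixels, and output pixels that map outside the source tile are marked with NaN so that a later compositing step can ignore them.

```python
# Minimal sketch: warp one captured tile into corrected co-ordinates by
# mapping each output pixel back through R and interpolating bi-linearly.
import numpy as np


def warp_tile(tile, R, out_shape):
    """Resample `tile` onto an output grid of shape (height, width).

    R is the 3x3 reverse transform taking corrected (x', y') to source (x, y).
    """
    h_out, w_out = out_shape
    h_in, w_in = tile.shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    u, v, w = R @ pts
    sx, sy = u / w, v / w  # source location for every output pixel
    valid = (sx >= 0) & (sx <= w_in - 1) & (sy >= 0) & (sy <= h_in - 1)
    # Top-left neighbour, clamped so that x0 + 1 and y0 + 1 stay in range.
    x0 = np.clip(np.floor(sx), 0, w_in - 2).astype(int)
    y0 = np.clip(np.floor(sy), 0, h_in - 2).astype(int)
    fx, fy = sx - x0, sy - y0
    top = tile[y0, x0] * (1 - fx) + tile[y0, x0 + 1] * fx
    bottom = tile[y0 + 1, x0] * (1 - fx) + tile[y0 + 1, x0 + 1] * fx
    out = np.where(valid, top * (1 - fy) + bottom * fy, np.nan)
    return out.reshape(out_shape)


# Usage: an identity transform returns the tile unchanged, and a
# half-pixel shift samples midway between horizontal neighbours.
demo = np.arange(12, dtype=float).reshape(3, 4)
same = warp_tile(demo, np.eye(3), (3, 4))
shift = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
shifted = warp_tile(demo, shift, (3, 4))
```

In practice R would be obtained by inverting the stored matrix T for the tile, for example with `numpy.linalg.inv`, and pixels covered by more than one corrected tile could be taken from one tile or averaged, as the paragraph above describes.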

[0049] Figure 5 is a flowchart describing an image capture process 70
including an initialisation process in which the transform data T,R
may be generated, or regenerated if required. If the image capture
system 1 needs to be calibrated 71, for example during initial
calibration when the image capture system is manufactured, then the
registration array 60 is placed 72 in view of the camera 2 in the
same orientation as the document 10 to be imaged. It does not matter
if the registration array 60 is larger than the document 10, but if
the registration array 60 is smaller, then it is not possible to
generate a full set of transformation data T,R for each image tile
51,52,53.

[0050] Next, the camera 2 is used 73 to capture overlapping image
tiles 51,52,53 of the registration array 60 in the same manner as the
document 10 to be imaged.
[0051] As described above, the processor 40 is then used 74 to
identify in each of the captured images 51,52,53 at least four
individual location identification features 61. For greater accuracy,
these at least four features should be spread over a substantial
portion, such as at least half, of the captured image tile. Because
the actual spacing and orientation of the location identification
features 61 is known in advance, the transform data T,R to account
for distortion and overlap of each image tile 51,52,53 can be
generated 75 from the apparent location of the features 61 on the
detector array 4. The calibration process is complete when the
transform data T,R for each image tile 51,52,53 is stored 76 in
memory 41.

[0052] Once calibration is complete, or if no calibration is needed
77, the document 10 can then be placed 78 in view of the camera 2, in
the same area as was covered by the registration array 60. The camera
2 is then used 79 to capture overlapping image tiles 51,52,53 of the
document 10 in the same manner as was done for the registration array
60. The transform data T is recalled 80 from memory 41 and used to
generate 55 corrected image tiles 151,152,153 with the correct
overlap and orientation with respect to neighbouring corrected image
tiles. Finally, the corrected image tiles 151,152,153 can be joined
81 into a composite image 54 of the document 10.

[0053] The mechanism can operate at a speed com- 
parable to a flatbed scanner. Much of the calculation can 
be done during the time of the mechanical movement 
and data transfer. 
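
One way to overlap calculation with movement is sketched below, assuming the capture call blocks while the actuator moves and the image is transferred. The callables `move_and_capture` and `correct` are hypothetical placeholders for the hardware interface and the tile correction; the application does not describe how the overlap is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def scan(positions, move_and_capture, correct):
    """Capture tiles in a fixed order while earlier tiles are corrected in the background."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        futures = []
        for index, position in enumerate(positions):
            # Blocks during mechanical movement and data transfer; the worker
            # thread can process the previous tile in the meantime.
            tile = move_and_capture(position)
            futures.append(pool.submit(correct, index, tile))
        # Results come back in capture order.
        return [future.result() for future in futures]
```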

[0054] The image capture system described above
provides an economical and practical solution to the
problem of how to use an inexpensive electronic camera
to generate a higher resolution image of a document.
Image transform data is empirically derived for the
system, for example using a registration array, in the
same manner as the system is used to image a document.
In particular, an inexpensive actuator can be used as
long as the positioning of the actuator is repeatable
when it is moved between image tiles in a predetermined
order. The achievable resolution and time taken
by such a system to image a document compare
favourably with a flatbed scanner, while of course work
surface space can be freed for other uses when the
image capture system is not in use.



Claims 

1. An image capture system (1), comprising: an
electronic camera (2) with an electronic detector (4) and
a lens (6) with a field of view (8) for imaging on the
detector (4) a portion (12,14,46) of a document (10);
an actuator (25) for moving (44) the camera field of
view (8) over a document support surface (20), the
camera (2) and the actuator (25) co-operating so
that a plurality of overlapping image tiles (51,52,53)
of a document (10) can be captured at different
locations (12,14,46) over the support surface (20),
each image tile (51,52,53) having an array of tile
data points (58) and being subject to some expected
perspective and/or camera distortion relative to
the support surface (20); and electronic processing
means (32) by which the plurality of image tiles
(51,52,53) may be joined into a composite image
(54) of the document (10); characterised in that:

i) for each image tile (51,52,53) the camera field
of view (8) relative to the support surface (20)
and the degree of overlap (50) between
neighbouring image tiles (51,52,53) are predetermined;

ii) the electronic processing means (32) includes
a memory (41) which stores transform
data (55,56) for each image tile (51,52,53), the
transform data (55,56) relating both to the
expected distortion and to the predetermined
overlap (50) between image tiles (51,52,53);
and

iii) the processing means (32) is adapted to use
the transform data (55,56) to generate from the
tile data points (58) a corrected array of tile data
points (158) with said distortion corrected and
with the corrected image tiles (151,152,153)
correctly overlapped (150) with respect to
neighbouring corrected image tiles
(151,152,153) to form a composite image (54)
of the document (10).

2. A method of capturing an image of a document
using an image capture system (1) as claimed in Claim
1, in which the method comprises the steps of:

a) positioning the camera (2) above the document
(10) support surface (20) and placing a
document (10) on the support surface (20) within
view of the camera (2);

b) capturing a plurality of overlapping image
tiles (51,52,53) of the document (10) at different
locations (12,14,46) over the support surface
(20), each image tile (51,52,53) having an array
of tile data points (58);

characterised in that in step b) the different
locations (12,14,46) are predetermined so that for each
image tile (51,52,53) the camera field of view (8)
relative to the support surface (20) and the degree
of overlap (50) between neighbouring image tiles
(51,52,53) are predetermined, and in that the
method comprises the steps of:

c) using the electronic processing means (32)
to generate from the transform data (55,56) and
the tile data points (58) a corrected array of tile
data points (158) in order to correct said distortion
and correctly overlap each corrected image
tile (151,152,153) with respect to neighbouring
corrected image tiles (151,152,153); and

d) joining neighbouring corrected image tiles
(151,152,153) to form a composite image of the
document (10).

3. A method of capturing an image of a document as
claimed in Claim 2, in which prior to storing of the
transform data (55,56) in the memory (41), the
method comprises the steps of:

e) providing a two-dimensional registration array
(60) within the field of view (8) of the camera
(2) across an area corresponding with the document
(10) to be imaged, the registration array
(60) having a plurality of individually identifiable
location identification features (61) with a
predetermined orientation and spacing amongst
the features (61);

f) using the camera (2) to capture a plurality of
overlapping image tiles (51,52,53) of the
registration array (60) at predetermined locations
(12,14,46) and predetermined overlap (15),
said locations and overlap corresponding to
those to be used with the document (10) to be
imaged, each image tile (51,52,53) having an
array of tile data points (58) that cover a plurality
of location features (61);

g) identifying for each image tile a plurality of
individual location identification features (61),
associating with said features particular tile data
points (58) and from the predetermined orientation
and spacing of the specific features
(61) determining from the tile data points (58)
if there is any image distortion in that image tile
(51,52,53);

h) generating from the identity of the location
identification features (61) and the determined
distortion the transform data (55,56) for each
image tile (51,52,53).

4. A method of capturing an image of a document as
claimed in Claim 3, in which step f) involves
using the actuator (25) to move the camera (2)
between image tiles (51,52,53) in the same order as
for the document (10) to be imaged.

5. A method of capturing an image of a document as
claimed in Claim 3 or Claim 4, in which step g)
involves identifying at least four such location
identification features (61).

6. A method of capturing an image of a document as
claimed in any of Claims 3 to 5, in which each image
tile (51,52,53) captured of the registration array (60)
has at least one unique location identification
feature (62).

7. A method of capturing an image of a document as
claimed in any of Claims 3 to 6, in which the location
identification features (61) are printed on a card.

8. A method of capturing an image of a document as
claimed in any of Claims 2 to 7, in which the actuator
(25) is arranged to rotate (34,36) the camera (2)
about the optical centre of the lens (6) as the
camera field of view (8) is moved over the support
surface.

9. A method of capturing an image of a document as
claimed in any of Claims 2 to 7, in which the camera
(2) includes a focus mechanism (7), and the
transform data (55,56) includes separate data for
different focus settings, the method comprising the steps
of:

i) focussing (7) the camera (2) on the document
(10); and

j) selecting the transform data (55,56) according
to the focus setting.

[Figure: flow chart, box text]

Use processor to identify in each of the captured images at least four individual location
identification features

Determine from image features distortion in each image tile and the relative orientation
of neighbouring image tiles, and from this generate transform data for each image tile

Store the transform data for each image tile in memory

Place the document to be imaged within view of the camera

Use camera to capture overlapping image tiles of the document in same manner as was
done for the registration array

Recall the transform data for each tile and use this to generate corrected image tiles with
correct overlap with neighbouring corrected image tiles

Join neighbouring corrected image tiles to form a composite image of the document






European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 99 30 8537

DOCUMENTS CONSIDERED TO BE RELEVANT

Category  Citation of document with indication,       Relevant   Classification of the
          where appropriate, of relevant passages     to claim   application (Int.Cl.7)

X         US 4 485 409 A (SCHUMACHER PETER M)         1-3        H04N1/04
A         27 November 1984 (1984-11-27)               4-9
          * the whole document *

          EP 0 743 784 A (SHARP KK)                   1-7
          20 November 1996 (1996-11-20)
          * abstract *
          * column 1, line 11 - column 3, line 37 *
          * column 4, line 31 - column 11, line 2 *

          WO 84 02046 A (RIDGE WARREN J ; ROBERTS     1-9
          DENNIS C (US)) 24 May 1984 (1984-05-24)
          * abstract *
          * page 4, line 2 - page 6, line 30 *

Technical fields searched (Int.Cl.7): H04N

The present search report has been drawn up for all claims.

Place of search: THE HAGUE
Date of completion of the search: 24 March 2000
Examiner: Hubeau, R

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another
    document of the same category
A : technological background
O : non-written disclosure
P : intermediate document
T : theory or principle underlying the invention
E : earlier patent document, but published on, or
    after the filing date
D : document cited in the application
L : document cited for other reasons
& : member of the same patent family, corresponding
    document




ANNEX TO THE EUROPEAN SEARCH REPORT
ON EUROPEAN PATENT APPLICATION NO. EP 99 30 8537

This annex lists the patent family members relating to the patent documents cited in the above-mentioned European search report.
The members are as contained in the European Patent Office EDP file on 24-03-2000.
The European Patent Office is in no way liable for these particulars which are merely given for the purpose of information.

Patent document          Publication   Patent family    Publication
cited in search report   date          member(s)        date

US 4485409 A             27-11-1984    AU 1608483 A     24-10-1983
                                       CA 1196417 A     05-11-1985
                                       DK 545383 A      29-11-1983
                                       EP 0104254 A     04-04-1984
                                       FI 834343 A      28-11-1983
                                       NO 834370 A      28-11-1983
                                       WO 8303516 A     13-10-1983
                                       ZA 8302196 A     30-05-1984

EP 743784 A              20-11-1996    JP 8317275 A     29-11-1996
                                       US 5880778 A     09-03-1999

WO 8402046 A             24-05-1984    EP 0125238 A     21-11-1984

For more details about this annex: see Official Journal of the European Patent Office, No. 12/82


